About Oversampling and undersampling in data analysis

Oversampling and undersampling in data analysis: Wikipedia, English edition

Oversampling and undersampling in data analysis

Oversampling and undersampling in data analysis are techniques used to adjust the class distribution of a dataset (i.e. the ratio between the different classes/categories represented).
Oversampling and undersampling are opposite and roughly equivalent techniques. Both deliberately bias the selection so that more samples are taken from one class than from another.
The usual reason for oversampling is to correct for a bias in the original dataset. One scenario where it is useful is training a classifier on labelled data from a biased source, since labelled training data is valuable but often comes from unrepresentative sources.
For example, suppose we have a sample of 1000 people of which 66.7% are male (perhaps the sample was collected at a football match), i.e. about 667 men and 333 women. We know the general population is 50% female, and we may wish to adjust our dataset to reflect this. Simple ''oversampling'' selects each female example twice; this copying produces a roughly balanced dataset of 1333 samples (667 male, 666 female), about 50% female. Simple ''undersampling'' drops male samples at random until as many men remain as women, giving a balanced dataset of about 666 samples, again with 50% female.
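The arithmetic in this example can be checked with a short sketch. It assumes the 1000-person sample splits into exactly 667 men and 333 women; the function names `oversample`, `undersample` and `female_share` are illustrative and do not come from any library.

```python
import random

def oversample(majority, minority, copies=2):
    """Simple oversampling: repeat every minority example `copies` times."""
    return majority + minority * copies

def undersample(majority, minority, rng):
    """Simple undersampling: keep a random subset of the majority class
    that is the same size as the minority class."""
    return rng.sample(majority, len(minority)) + minority

def female_share(data):
    """Fraction of examples labelled 'female'."""
    return sum(1 for label, _ in data if label == "female") / len(data)

# Assumed split of the 1000-person sample: 667 male, 333 female.
males = [("male", i) for i in range(667)]
females = [("female", i) for i in range(333)]

rng = random.Random(0)  # fixed seed so the random drop is repeatable
over = oversample(males, females)
under = undersample(males, females, rng)

print(len(over), round(female_share(over), 3))    # 1333 0.5
print(len(under), round(female_share(under), 3))  # 666 0.5
```

Oversampling keeps every original example but repeats some of them, whereas undersampling discards information from the larger class; which trade-off is acceptable depends on how much data is available.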
There are also more complex oversampling techniques, including the creation of artificial data points.
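The text does not say how such artificial points are created. One possible approach, shown here purely as an assumption, is to place each new point at a random position on the line segment between two existing minority-class points. This is loosely in the spirit of interpolation-based methods such as SMOTE, which additionally restricts the partner point to a nearest neighbour. The function `synthesize` and the toy data are hypothetical.

```python
import random

def synthesize(minority, n_new, rng):
    """Create n_new artificial points, each lying on the line segment
    between two distinct, randomly chosen minority-class points."""
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)   # two different existing points
        t = rng.random()                 # interpolation weight in [0, 1)
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic

# Toy minority class with two numeric features (made-up values).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]

rng = random.Random(0)  # fixed seed for repeatability
new_points = synthesize(minority, 6, rng)
augmented = minority + new_points  # 4 original + 6 artificial points

print(len(augmented))
```

Unlike plain copying, the new points are not exact duplicates, but they only make sense for numeric features where interpolation is meaningful.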
== See also ==

* Sampling (statistics)
* Oversampling in signal processing, an unrelated concept.

Excerpt source: Wikipedia, the free encyclopedia